Learning trees from strings: a strong learning algorithm for some context-free grammars

Author

  • Alexander Clark
Abstract

Standard models of language learning are concerned with weak learning: the learner, receiving as input only information about the strings in the language, must learn to generalise and to generate the correct, potentially infinite, set of strings generated by some target grammar. Here we define the corresponding notion of strong learning: the learner, again only receiving strings as input, must learn a grammar that generates the correct set of structures or parse trees. We formalise this using a modification of Gold’s identification in the limit model, requiring convergence to a grammar that is isomorphic to the target grammar. We take as our starting point a simple learning algorithm for substitutable context-free languages, based on principles of distributional learning, and modify it so that it will converge to a canonical grammar for each language. We prove a corresponding strong learning result for a subclass of context-free grammars.
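As a rough illustration of the distributional idea the abstract refers to (substrings that share a context are treated as interchangeable), the following Python sketch reads a context-free grammar off a finite sample. It is a minimal approximation in the spirit of the substitutable-language learner that the paper takes as its starting point, not the paper's actual algorithm; the function names, the union-find bookkeeping, and the way classes are named are all assumptions for illustration.

```python
# A minimal sketch, assuming a finite sample of sentences over a small
# alphabet. It merges substrings that share a context (the substitutability
# assumption) and reads a CFG off the resulting classes.

from itertools import combinations


def substrings(w):
    """All non-empty contiguous substrings of w."""
    return {w[i:j] for i in range(len(w)) for j in range(i + 1, len(w) + 1)}


def contexts(u, sample):
    """All contexts (l, r) such that l + u + r is a sentence of the sample."""
    found = set()
    for w in sample:
        start = w.find(u)
        while start != -1:
            found.add((w[:start], w[start + len(u):]))
            start = w.find(u, start + 1)
    return found


def learn_grammar(sample):
    """Return (start_symbols, rules) for a CFG built from the sample."""
    subs = set().union(*(substrings(w) for w in sample))

    # Union-find over substrings: merge u and v if they share a context.
    parent = {u: u for u in subs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ctx = {u: contexts(u, sample) for u in subs}
    for u, v in combinations(subs, 2):
        if ctx[u] & ctx[v]:
            parent[find(u)] = find(v)

    # Name each congruence class and read off the rules.
    names = {}

    def nt(u):
        return names.setdefault(find(u), f"N{len(names)}")

    rules = set()
    for w in subs:
        if len(w) == 1:
            rules.add((nt(w), (w,)))                    # N -> a
        for i in range(1, len(w)):
            rules.add((nt(w), (nt(w[:i]), nt(w[i:]))))  # N -> N1 N2
    start = {nt(w) for w in sample}
    return start, rules


if __name__ == "__main__":
    start, rules = learn_grammar(["ab", "aabb"])
    print("start:", start)
    for lhs, rhs in sorted(rules):
        print(lhs, "->", " ".join(rhs))
```

On this toy sample the strings "ab" and "aabb" share the empty context, so their classes are merged and the resulting grammar generalises to the a^n b^n pattern; this is the weak-learning behaviour that the paper then strengthens to convergence on a canonical, isomorphic grammar.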


Similar articles

Rigid Lambek Grammars Are Not Learnable from Strings

This paper is concerned with learning categorial grammars in Gold's model (Gold, 1967). Recently, learning algorithms in this model have been proposed for some particular classes of classical categorial grammars (Kanazawa, 1998). We show that in contrast to classical categorial grammars, rigid and k-valued Lambek grammars are not learnable from strings. This result holds for several variants of...


Canonical Context-Free Grammars and Strong Learning: Two Approaches

Strong learning of context-free grammars is the problem of learning a grammar which is not just weakly equivalent to a target grammar but isomorphic or structurally equivalent to it. This is closely related to the problem of defining a canonical grammar for the language. The current proposal for strong learning of a small class of CFGs uses grammars whose nonterminals correspond to congruence c...


Polynomial Time Learning of Some Multiple Context-Free Languages with a Minimally Adequate Teacher

We present an algorithm for the inference of some Multiple Context-Free Grammars from Membership and Equivalence Queries, using the Minimally Adequate Teacher model of Angluin. This is an extension of the congruence based methods for learning some Context-Free Grammars proposed by Clark (ICGI 2010). We define the natural extension of the syntactic congruence to tuples of strings, and demonstrat...
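As a rough picture of what a syntactic congruence on tuples of strings looks like, the sketch below plugs k-tuples into contexts given as k+1 segments and compares membership under a language oracle. The oracle, the test contexts, and all names are illustrative assumptions, not the interface used in the cited work.

```python
# A hedged sketch: a k-tuple fills the k gaps of a context (c0, ..., ck),
# and two tuples are deemed equivalent when every test context accepts or
# rejects them together.

import re


def wrap(context, tup):
    """Plug a k-tuple into a context (c0, c1, ..., ck): c0 u1 c1 ... uk ck."""
    assert len(context) == len(tup) + 1
    parts = [context[0]]
    for u, c in zip(tup, context[1:]):
        parts += [u, c]
    return "".join(parts)


def congruent(t1, t2, test_contexts, member):
    """Approximate congruence: agreement on a finite set of test contexts."""
    return all(member(wrap(c, t1)) == member(wrap(c, t2)) for c in test_contexts)


if __name__ == "__main__":
    # Toy non-context-free language {a^n b^m c^n d^m}: the pair (a^n, c^n)
    # behaves like a single discontinuous constituent, which tuples capture.
    def member(w):
        m = re.fullmatch(r"(a*)(b*)(c*)(d*)", w)
        return bool(m and len(m.group(1)) == len(m.group(3))
                    and len(m.group(2)) == len(m.group(4)))

    tests = [("", "b", "d"), ("a", "bb", "dd")]
    print(congruent(("a", "c"), ("aa", "cc"), tests, member))  # True
    print(congruent(("a", "c"), ("a", "cc"), tests, member))   # False
```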


k-Valued Link Grammars are Learnable from Strings

The article is concerned with learning link grammars in the model of Gold. We show that rigid and k-valued link grammars are learnable from strings. In fact, we prove that the languages of link structured lists of words associated to rigid link grammars have finite elasticity and we show a learning algorithm. As a standard corollary, this result leads to the learnability of rigid or k-valued li...


Learning context-free grammars from stochastic structural information

We consider the problem of learning context-free grammars from stochastic structural data. For this purpose, we have developed an algorithm (tlips) which identifies any rational tree set from stochastic samples and approximates the probability distribution of the trees in the language. The procedure identifies equivalent subtrees in the sample and outputs the hypothesis in linear time with the nu...
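The sketch below illustrates, under stated assumptions, the general idea of grouping equivalent subtrees in a sample of parse trees by the one-hole contexts they occur in. It is not the tlips procedure itself and it ignores the stochastic aspect entirely; the tree encoding and all names are assumptions for illustration.

```python
# A hedged sketch: trees are nested tuples (label, child, ...) with strings
# at the leaves. Any two subtrees occurring in the same one-hole context are
# merged, approximating the equivalence classes of the tree language.

from collections import defaultdict

HOLE = "□"  # marks the gap in a tree context


def subtrees_with_contexts(tree, acc, context=lambda t: t):
    """Append (subtree, context-with-HOLE) pairs for every subtree of tree."""
    acc.append((tree, context(HOLE)))
    if isinstance(tree, tuple):
        label, *children = tree
        for i, child in enumerate(children):
            def child_context(t, i=i, children=children, label=label,
                              context=context):
                new_children = list(children)
                new_children[i] = t
                return context((label, *new_children))
            subtrees_with_contexts(child, acc, child_context)


def equivalence_classes(sample):
    """Merge subtrees that occur in the same one-hole context in the sample."""
    pairs = []
    for tree in sample:
        subtrees_with_contexts(tree, pairs)

    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    by_context = defaultdict(list)
    for sub, ctx in pairs:
        by_context[ctx].append(sub)
    for subs in by_context.values():
        for s in subs[1:]:
            parent[find(subs[0])] = find(s)

    classes = defaultdict(set)
    for sub, _ in pairs:
        classes[find(sub)].add(sub)
    return list(classes.values())


if __name__ == "__main__":
    t1 = ("S", ("A", "a"), ("B", "b"))
    t2 = ("S", ("A", "a", "a"), ("B", "b"))
    for cls in equivalence_classes([t1, t2]):
        print(cls)
```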



Journal:
  • Journal of Machine Learning Research

Volume: 14  Issue: –

Pages: –

Publication date: 2013